Overview

Dataset statistics

Number of variables: 25
Number of observations: 105542
Missing cells: 416
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 20.1 MiB
Average record size in memory: 200.0 B
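
The memory figures above are mutually consistent under a simple model. The sketch below assumes (the report does not say this) that every column stores one 8-byte value or pointer per row, which is how a shallow memory estimate would count them. The constant names are illustrative.

```python
# Figures copied from the dataset statistics.
N_VARIABLES = 25
N_OBSERVATIONS = 105_542

# Assumption: 8 bytes per cell (a 64-bit number or an object pointer).
BYTES_PER_CELL = 8

record_bytes = N_VARIABLES * BYTES_PER_CELL
total_mib = record_bytes * N_OBSERVATIONS / 2**20

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
```

Under that assumption the per-record and total sizes agree with the reported 200.0 B and 20.1 MiB; string payloads are not counted in such an estimate.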

Variable types

Numeric: 10
Text: 5
Categorical: 10

Alerts

High correlation: article_id is highly overall correlated with product_code
High correlation: colour_group_code is highly overall correlated with colour_group_name and 1 other field
High correlation: colour_group_name is highly overall correlated with colour_group_code and 5 other fields
High correlation: department_no is highly overall correlated with index_code and 3 other fields
High correlation: garment_group_name is highly overall correlated with garment_group_no and 1 other field
High correlation: garment_group_no is highly overall correlated with garment_group_name and 1 other field
High correlation: graphical_appearance_name is highly overall correlated with graphical_appearance_no
High correlation: graphical_appearance_no is highly overall correlated with colour_group_name and 2 other fields
High correlation: index_code is highly overall correlated with department_no and 4 other fields
High correlation: index_group_name is highly overall correlated with department_no and 4 other fields
High correlation: index_group_no is highly overall correlated with department_no and 4 other fields
High correlation: index_name is highly overall correlated with department_no and 4 other fields
High correlation: perceived_colour_master_id is highly overall correlated with colour_group_name and 1 other field
High correlation: perceived_colour_master_name is highly overall correlated with colour_group_code and 4 other fields
High correlation: perceived_colour_value_id is highly overall correlated with colour_group_name and 2 other fields
High correlation: perceived_colour_value_name is highly overall correlated with colour_group_name and 3 other fields
High correlation: product_code is highly overall correlated with article_id
High correlation: product_group_name is highly overall correlated with garment_group_name and 2 other fields
High correlation: product_type_no is highly overall correlated with product_group_name
High correlation: section_no is highly overall correlated with index_code and 3 other fields
Skewed: graphical_appearance_no is highly skewed (γ1 = -45.01901161)
Unique: article_id has unique values
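
The article_id / product_code alert has a likely simple explanation. Judging only by the extreme values listed further down in this report (for example article_id 108775015 next to product_code 108775), product_code looks like article_id with its last three digits dropped. The sketch below checks that guess on a few values copied from the report. The helper name is hypothetical and the relationship is an inference, not something the report states.

```python
def product_code_from_article_id(article_id: int) -> int:
    """Drop the last three digits (assumed to be a per-product variant suffix)."""
    return article_id // 1000

# (article_id, product_code) values taken from the report's min/max tables.
pairs = [
    (108775015, 108775),
    (110065001, 110065),
    (959461001, 959461),
]

for article_id, product_code in pairs:
    assert product_code_from_article_id(article_id) == product_code
```

If the guess holds for the full column, one of the two fields carries no extra information for a correlation-sensitive model.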

Reproduction

Analysis started: 2025-11-20 16:47:04.208986
Analysis finished: 2025-11-20 16:47:31.831505
Duration: 27.62 seconds
Software version: ydata-profiling v4.17.0
Download configuration: config.json

Variables

article_id
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 105542
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.9842457 × 10^8
Minimum: 1.0877502 × 10^8
Maximum: 9.59461 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 824.7 KiB

Quantile statistics

Minimum: 1.0877502 × 10^8
5-th percentile: 4.9381002 × 10^8
Q1: 6.169925 × 10^8
Median: 7.02213 × 10^8
Q3: 7.96703 × 10^8
95-th percentile: 8.8937901 × 10^8
Maximum: 9.59461 × 10^8
Range: 8.5068599 × 10^8
Interquartile range (IQR): 1.797105 × 10^8

Descriptive statistics

Standard deviation: 1.2846238 × 10^8
Coefficient of variation (CV): 0.18393165
Kurtosis: 0.66097576
Mean: 6.9842457 × 10^8
Median Absolute Deviation (MAD): 90074996
Skewness: -0.57728335
Sum: 7.3713126 × 10^13
Variance: 1.6502583 × 10^16
Monotonicity: Strictly increasing
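
Several rows in these tables are derived from the others. As a quick consistency check, the sketch below recomputes the range, the interquartile range, and the coefficient of variation for article_id from the reported figures. The variable names are illustrative, and the inputs are the rounded values printed in the report, so small rounding differences are expected.

```python
# Values copied from the article_id tables in the report.
minimum, maximum = 108_775_015, 959_461_001
q1, q3 = 6.169925e8, 7.96703e8
mean, std = 6.9842457e8, 1.2846238e8

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean

print(value_range, iqr, round(cv, 4))
```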
Histogram with fixed size bins (bins=50)

Common values

Value                     Count     Frequency (%)
959461001                 1         < 0.1%
108775015                 1         < 0.1%
108775044                 1         < 0.1%
108775051                 1         < 0.1%
110065001                 1         < 0.1%
110065002                 1         < 0.1%
946527001                 1         < 0.1%
946748001                 1         < 0.1%
946748003                 1         < 0.1%
946748004                 1         < 0.1%
Other values (105532)     105532    > 99.9%

Minimum 10 values

Value                     Count     Frequency (%)
108775015                 1         < 0.1%
108775044                 1         < 0.1%
108775051                 1         < 0.1%
110065001                 1         < 0.1%
110065002                 1         < 0.1%
110065011                 1         < 0.1%
111565001                 1         < 0.1%
111565003                 1         < 0.1%
111586001                 1         < 0.1%
111593001                 1         < 0.1%

Maximum 10 values

Value                     Count     Frequency (%)
959461001                 1         < 0.1%
957375001                 1         < 0.1%
956217002                 1         < 0.1%
953763001                 1         < 0.1%
953450001                 1         < 0.1%
952938001                 1         < 0.1%
952937003                 1         < 0.1%
952267001                 1         < 0.1%
950449002                 1         < 0.1%
949594001                 1         < 0.1%

product_code
Real number (ℝ)

Alerts: High correlation

Distinct: 47224
Distinct (%): 44.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 698424.56
Minimum: 108775
Maximum: 959461
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 824.7 KiB

Quantile statistics

Minimum: 108775
5-th percentile: 493810
Q1: 616992.5
Median: 702213
Q3: 796703
95-th percentile: 889379
Maximum: 959461
Range: 850686
Interquartile range (IQR): 179710.5

Descriptive statistics

Standard deviation: 128462.38
Coefficient of variation (CV): 0.18393165
Kurtosis: 0.66097587
Mean: 698424.56
Median Absolute Deviation (MAD): 90075
Skewness: -0.57728339
Sum: 7.3713125 × 10^10
Variance: 1.6502584 × 10^10
Monotonicity: Increasing
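
The report labels article_id "Strictly increasing" but product_code only "Increasing", meaning repeated values are allowed. A minimal sketch of that distinction follows, using a hypothetical helper and a few article_id values from the tables above. How ydata-profiling derives the label internally is not shown in the report, and deriving product_code by dropping three digits is an assumption.

```python
def monotonicity(values):
    """Classify a sequence using the labels seen in the Monotonicity row."""
    steps = list(zip(values, values[1:]))
    if all(a < b for a, b in steps):
        return "Strictly increasing"
    if all(a <= b for a, b in steps):
        return "Increasing"
    if all(a > b for a, b in steps):
        return "Strictly decreasing"
    if all(a >= b for a, b in steps):
        return "Decreasing"
    return "Not monotonic"

article_ids = [108775015, 108775044, 108775051, 110065001]
# Assumption: product_code is article_id without its last three digits.
product_codes = [a // 1000 for a in article_ids]

print(monotonicity(article_ids))
print(monotonicity(product_codes))
```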
Histogram with fixed size bins (bins=50)

Common values

Value                     Count     Frequency (%)
783707                    75        0.1%
684021                    70        0.1%
699923                    52        < 0.1%
699755                    49        < 0.1%
685604                    46        < 0.1%
739659                    44        < 0.1%
685816                    41        < 0.1%
664074                    41        < 0.1%
570002                    41        < 0.1%
562245                    41        < 0.1%
Other values (47214)      105042    99.5%

Minimum 10 values

Value                     Count     Frequency (%)
108775                    3         < 0.1%
110065                    3         < 0.1%
111565                    2         < 0.1%
111586                    1         < 0.1%
111593                    1         < 0.1%
111609                    1         < 0.1%
112679                    2         < 0.1%
114428                    2         < 0.1%
116379                    1         < 0.1%
118458                    7         < 0.1%

Maximum 10 values

Value                     Count     Frequency (%)
959461                    1         < 0.1%
957375                    1         < 0.1%
956217                    1         < 0.1%
953763                    1         < 0.1%
953450                    1         < 0.1%
952938                    1         < 0.1%
952937                    1         < 0.1%
952267                    1         < 0.1%
950449                    1         < 0.1%
949594                    1         < 0.1%

Text

Distinct: 45875
Distinct (%): 43.5%
Missing: 0
Missing (%): 0.0%
Memory size: 824.7 KiB

Length

Max length: 30
Median length: 23
Mean length: 15.535569
Min length: 1

Characters and Unicode

Total characters: 1639655
Distinct characters: 91
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 22920
Unique (%): 21.7%

Sample

1st row: Strap top
2nd row: Strap top
3rd row: Strap top (1)
4th row: OP T-shirt (Idro)
5th row: OP T-shirt (Idro)
Word                      Count     Frequency (%)
dress                     7825      2.6%
tee                       4553      1.5%
top                       3938      1.3%
shorts                    3555      1.2%
fancy                     2796      0.9%
ls                        2336      0.8%
hood                      2294      0.8%
sb                        2252      0.8%
set                       2133      0.7%
1                         2043      0.7%
Other values (13649)      261891    88.6%
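
The word table above counts lowercase tokens across all rows. A minimal sketch of that kind of count, applied to the five sample rows shown for this variable, could look as follows. Lowercasing and splitting on whitespace is an assumption about the tokenisation; the report does not describe the exact rules it uses.

```python
from collections import Counter

# The five sample rows shown in the report for this variable.
sample_rows = [
    "Strap top",
    "Strap top",
    "Strap top (1)",
    "OP T-shirt (Idro)",
    "OP T-shirt (Idro)",
]

# Assumption: lowercase, then split on whitespace; punctuation is kept as-is.
word_counts = Counter(
    word for row in sample_rows for word in row.lower().split()
)

for word, count in word_counts.most_common(3):
    print(word, count)
```

A real tokeniser would probably strip the parentheses, which is likely how the bare token "1" ends up in the table.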

Most occurring characters

Character                 Count     Frequency (%)
(space)                   190600    11.6%
e                         116144    7.1%
a                         94570     5.8%
s                         79849     4.9%
r                         78145     4.8%
i                         76131     4.6%
o                         67798     4.1%
n                         65393     4.0%
t                         63950     3.9%
l                         58420     3.6%
Other values (81)         748655    45.7%


product_type_no
Real number (ℝ)

Alerts: High correlation

Distinct: 132
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 234.86187
Minimum: -1
Maximum: 762
Zeros: 0
Zeros (%): 0.0%
Negative: 121
Negative (%): 0.1%
Memory size: 824.7 KiB

Quantile statistics

Minimum: -1
5-th percentile: 70
Q1: 252
Median: 259
Q3: 272
95-th percentile: 304
Maximum: 762
Range: 763
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 75.049308
Coefficient of variation (CV): 0.31954658
Kurtosis: 1.1655822
Mean: 234.86187
Median Absolute Deviation (MAD): 13
Skewness: -1.4230313
Sum: 24787792
Variance: 5632.3986
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                     Count     Frequency (%)
272                       11169     10.6%
265                       10362     9.8%
252                       9302      8.8%
255                       7904      7.5%
254                       4155      3.9%
258                       3979      3.8%
262                       3940      3.7%
274                       3939      3.7%
259                       3405      3.2%
253                       2991      2.8%
Other values (122)        44396     42.1%

Minimum 10 values

Value                     Count     Frequency (%)
-1                        121       0.1%
49                        48        < 0.1%
57                        662       0.6%
59                        1307      1.2%
60                        50        < 0.1%
66                        1280      1.2%
67                        458       0.4%
68                        180       0.2%
69                        573       0.5%
70                        1159      1.1%

Maximum 10 values

Value                     Count     Frequency (%)
762                       3         < 0.1%
761                       5         < 0.1%
532                       3         < 0.1%
529                       4         < 0.1%
525                       1         < 0.1%
523                       2         < 0.1%
521                       7         < 0.1%
515                       6         < 0.1%
514                       1         < 0.1%
512                       24        < 0.1%

Text

Distinct: 131
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 824.7 KiB

Length

Max length: 24
Median length: 19
Mean length: 7.5308787
Min length: 3

Characters and Unicode

Total characters: 794824
Distinct characters: 51
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): < 0.1%

Sample

1st row: Vest top
2nd row: Vest top
3rd row: Vest top
4th row: Bra
5th row: Bra
Word                      Count     Frequency (%)
trousers                  11299     9.2%
dress                     10362     8.5%
sweater                   9302      7.6%
top                       8142      6.7%
t-shirt                   7904      6.5%
bottom                    4275      3.5%
blouse                    3979      3.3%
jacket                    3940      3.2%
shorts                    3939      3.2%
shirt                     3854      3.1%
Other values (140)        55357     45.2%

Most occurring characters

Character                 Count     Frequency (%)
e                         87702     11.0%
r                         86934     10.9%
s                         86754     10.9%
t                         65600     8.3%
o                         51144     6.4%
a                         50695     6.4%
i                         40748     5.1%
S                         29213     3.7%
T                         25917     3.3%
u                         23025     2.9%
Other values (41)         247092    31.1%


product_group_name
Categorical

Alerts: High correlation

Distinct: 19
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 824.7 KiB

Value                     Count
Garment Upper body        42741
Garment Lower body        19812
Garment Full body         13292
Accessories               11158
Underwear                 5490
Other values (14)         13049

Length

Max length: 21
Median length: 18
Mean length: 15.44064
Min length: 3

Characters and Unicode

Total characters: 1629636
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Garment Upper body
2nd row: Garment Upper body
3rd row: Garment Upper body
4th row: Underwear
5th row: Underwear

Common Values

Value                     Count     Frequency (%)
Garment Upper body        42741     40.5%
Garment Lower body        19812     18.8%
Garment Full body         13292     12.6%
Accessories               11158     10.6%
Underwear                 5490      5.2%
Shoes                     5283      5.0%
Swimwear                  3127      3.0%
Socks & Tights            2442      2.3%
Nightwear                 1899      1.8%
Unknown                   121       0.1%
Other values (9)          177       0.2%

Length

Histogram of lengths of the category

Word                      Count     Frequency (%)
garment                   75854     28.9%
body                      75845     28.9%
upper                     42741     16.3%
lower                     19812     7.6%
full                      13292     5.1%
accessories               11158     4.3%
underwear                 5490      2.1%
shoes                     5283      2.0%
swimwear                  3127      1.2%
socks                     2442      0.9%
Other values (16)         7102      2.7%

Most occurring characters

Character                 Count     Frequency (%)
e                         182285    11.2%
r                         165779    10.2%
(space)                   156604    9.6%
o                         114727    7.0%
a                         86526     5.3%
p                         85482     5.2%
n                         81847     5.0%
d                         81398     5.0%
t                         80347     4.9%
m                         79047     4.9%
Other values (25)         515594    31.6%


graphical_appearance_no
Real number (ℝ)

Alerts: High correlation, Skewed

Distinct: 30
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1009515.1
Minimum: -1
Maximum: 1010029
Zeros: 0
Zeros (%): 0.0%
Negative: 52
Negative (%): < 0.1%
Memory size: 824.7 KiB

Quantile statistics

Minimum: -1
5-th percentile: 1010001
Q1: 1010008
Median: 1010016
Q3: 1010016
95-th percentile: 1010023
Maximum: 1010029
Range: 1010030
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 22413.586
Coefficient of variation (CV): 0.022202329
Kurtosis: 2024.75
Mean: 1009515.1
Median Absolute Deviation (MAD): 1
Skewness: -45.019012
Sum: 1.0654624 × 10^11
Variance: 5.0236883 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)

Common values

Value                     Count     Frequency (%)
1010016                   49747     47.1%
1010001                   17165     16.3%
1010010                   5938      5.6%
1010017                   4990      4.7%
1010023                   4842      4.6%
1010008                   3215      3.0%
1010014                   3098      2.9%
1010004                   2178      2.1%
1010005                   1830      1.7%
1010021                   1513      1.4%
Other values (20)         11026     10.4%

Minimum 10 values

Value                     Count     Frequency (%)
-1                        52        < 0.1%
1010001                   17165     16.3%
1010002                   1341      1.3%
1010003                   15        < 0.1%
1010004                   2178      2.1%
1010005                   1830      1.7%
1010006                   681       0.6%
1010007                   1165      1.1%
1010008                   3215      3.0%
1010009                   958       0.9%

Maximum 10 values

Value                     Count     Frequency (%)
1010029                   8         < 0.1%
1010028                   86        0.1%
1010027                   66        0.1%
1010026                   1502      1.4%
1010025                   153       0.1%
1010024                   322       0.3%
1010023                   4842      4.6%
1010022                   830       0.8%
1010021                   1513      1.4%
1010020                   376       0.4%
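
The extreme skewness reported for graphical_appearance_no is what one would expect when a handful of -1 placeholder rows sit about a million units below codes packed into the range 1010001 to 1010029. The sketch below illustrates the effect on toy data, not on the real column. It uses the plain population formula for skewness; the profiling library may apply a bias correction, so the numbers are illustrative only. Treating -1 as a "missing" marker is an assumption based on the value tables.

```python
def skewness(values):
    """Population skewness: third central moment over the second to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Toy data: three codes in realistic proportions plus a single -1 placeholder.
codes = [1010001] * 20 + [1010016] * 60 + [1010023] * 20 + [-1]

with_sentinel = skewness(codes)
without_sentinel = skewness([c for c in codes if c != -1])

print(f"with -1:    {with_sentinel:.2f}")
print(f"without -1: {without_sentinel:.2f}")
```

Since the codes are identifiers rather than quantities, the skewness alert says more about the encoding than about the data; treating the column as categorical would sidestep it.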

graphical_appearance_name
Categorical

Alerts: High correlation

Distinct: 30
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 824.7 KiB

Value                     Count
Solid                     49747
All over pattern          17165
Melange                   5938
Stripe                    4990
Denim                     4842
Other values (25)         22860

Length

Max length: 19
Median length: 5
Mean length: 8.2858578
Min length: 3

Characters and Unicode

Total characters: 874506
Distinct characters: 42
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Solid
2nd row: Solid
3rd row: Stripe
4th row: Solid
5th row: Solid

Common Values

Value                     Count     Frequency (%)
Solid                     49747     47.1%
All over pattern          17165     16.3%
Melange                   5938      5.6%
Stripe                    4990      4.7%
Denim                     4842      4.6%
Front print               3215      3.0%
Placement print           3098      2.9%
Check                     2178      2.1%
Colour blocking           1830      1.7%
Lace                      1513      1.4%
Other values (20)         11026     10.4%

Length

Histogram of lengths of the category

Word                      Count     Frequency (%)
solid                     49747     32.9%
pattern                   17680     11.7%
all                       17165     11.4%
over                      17165     11.4%
print                     6313      4.2%
melange                   5938      3.9%
stripe                    4990      3.3%
denim                     4842      3.2%
front                     3215      2.1%
placement                 3098      2.0%
Other values (25)         21011     13.9%

Most occurring characters

Character                 Count     Frequency (%)
l                         102988    11.8%
o                         80380     9.2%
e                         77881     8.9%
i                         77859     8.9%
t                         67513     7.7%
r                         62943     7.2%
S                         55696     6.4%
d                         54006     6.2%
n                         48443     5.5%
(space)                   45622     5.2%
Other values (32)         201175    23.0%


colour_group_code
Real number (ℝ)

Alerts: High correlation

Distinct: 50
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.233822
Minimum: -1
Maximum: 93
Zeros: 0
Zeros (%): 0.0%
Negative: 28
Negative (%): < 0.1%
Memory size: 824.7 KiB

Quantile statistics

Minimum: -1
5-th percentile: 7
Q1: 9
Median: 14
Q3: 52
95-th percentile: 81
Maximum: 93
Range: 94
Interquartile range (IQR): 43

Descriptive statistics

Standard deviation: 28.086154
Coefficient of variation (CV): 0.87132561
Kurtosis: -1.0610471
Mean: 32.233822
Median Absolute Deviation (MAD): 7
Skewness: 0.7138227
Sum: 3402022
Variance: 788.83205
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                     Count     Frequency (%)
9                         22670     21.5%
73                        12171     11.5%
10                        9542      9.0%
51                        5811      5.5%
7                         4487      4.3%
12                        3356      3.2%
72                        3308      3.1%
42                        3056      2.9%
71                        3012      2.9%
19                        2767      2.6%
Other values (40)         35362     33.5%

Minimum 10 values

Value                     Count     Frequency (%)
-1                        28        < 0.1%
1                         105       0.1%
2                         31        < 0.1%
3                         709       0.7%
4                         94        0.1%
5                         1377      1.3%
6                         2105      2.0%
7                         4487      4.3%
8                         2731      2.6%
9                         22670     21.5%

Maximum 10 values

Value                     Count     Frequency (%)
93                        2106      2.0%
92                        815       0.8%
91                        681       0.6%
90                        129       0.1%
83                        473       0.4%
82                        435       0.4%
81                        1027      1.0%
80                        14        < 0.1%
73                        12171     11.5%
72                        3308      3.1%

colour_group_name
Categorical

Alerts: High correlation

Distinct: 50
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 824.7 KiB

Value                     Count
Black                     22670
Dark Blue                 12171
White                     9542
Light Pink                5811
Grey                      4487
Other values (45)         50861

Length

Max length: 15
Median length: 14
Mean length: 7.4805101
Min length: 3

Characters and Unicode

Total characters: 789508
Distinct characters: 38
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Black
2nd row: White
3rd row: Off White
4th row: Black
5th row: White

Common Values

Value                     Count     Frequency (%)
Black                     22670     21.5%
Dark Blue                 12171     11.5%
White                     9542      9.0%
Light Pink                5811      5.5%
Grey                      4487      4.3%
Light Beige               3356      3.2%
Blue                      3308      3.1%
Red                       3056      2.9%
Light Blue                3012      2.9%
Greenish Khaki            2767      2.6%
Other values (40)         35362     33.5%

Length

Histogram of lengths of the category

Word                      Count     Frequency (%)
dark                      23498     15.0%
black                     22670     14.4%
light                     19334     12.3%
blue                      18542     11.8%
white                     12268     7.8%
pink                      9442      6.0%
grey                      9323      5.9%
beige                     7378      4.7%
red                       5795      3.7%
green                     3731      2.4%
Other values (16)         25065     16.0%

Most occurring characters

Character                 Count     Frequency (%)
e                         87703     11.1%
k                         58405     7.4%
i                         58311     7.4%
l                         54192     6.9%
a                         52335     6.6%
(space)                   51504     6.5%
B                         50155     6.4%
r                         49945     6.3%
h                         40420     5.1%
t                         33220     4.2%
Other values (28)         253318    32.1%


perceived_colour_value_id
Real number (ℝ)

Alerts: High correlation

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2061833
Minimum: -1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 28
Negative (%): < 0.1%
Memory size: 824.7 KiB

Quantile statistics

Minimum: -1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 4
95-th percentile: 7
Maximum: 7
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5638389
Coefficient of variation (CV): 0.48775718
Kurtosis: -0.094881985
Mean: 3.2061833
Median Absolute Deviation (MAD): 1
Skewness: 0.27399945
Sum: 338387
Variance: 2.4455922
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Common values

Value                     Count     Frequency (%)
4                         42706     40.5%
1                         22152     21.0%
3                         15739     14.9%
2                         12630     12.0%
5                         6471      6.1%
7                         5711      5.4%
6                         105       0.1%
-1                        28        < 0.1%

Minimum 10 values

Value                     Count     Frequency (%)
-1                        28        < 0.1%
1                         22152     21.0%
2                         12630     12.0%
3                         15739     14.9%
4                         42706     40.5%
5                         6471      6.1%
6                         105       0.1%
7                         5711      5.4%

Maximum 10 values

Value                     Count     Frequency (%)
7                         5711      5.4%
6                         105       0.1%
5                         6471      6.1%
4                         42706     40.5%
3                         15739     14.9%
2                         12630     12.0%
1                         22152     21.0%
-1                        28        < 0.1%

perceived_colour_value_name
Categorical

Alerts: High correlation

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 824.7 KiB

Value                     Count
Dark                      42706
Dusty Light               22152
Light                     15739
Medium Dusty              12630
Bright                    6471
Other values (3)          5844

Length

Max length: 12
Median length: 11
Mean length: 6.8123022
Min length: 4

Characters and Unicode

Total characters: 718984
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Dark
2nd row: Light
3rd row: Dusty Light
4th row: Dark
5th row: Light

Common Values

Value                     Count     Frequency (%)
Dark                      42706     40.5%
Dusty Light               22152     21.0%
Light                     15739     14.9%
Medium Dusty              12630     12.0%
Bright                    6471      6.1%
Medium                    5711      5.4%
Undefined                 105       0.1%
Unknown                   28        < 0.1%
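
The counts for perceived_colour_value_name line up one-to-one with those listed for perceived_colour_value_id, which suggests the two columns encode the same information and explains their shared correlation alerts. The sketch below pairs ids with names purely by matching counts copied from the two sections. This is an inference from the frequencies; the report itself contains no id-to-name lookup table, and the dictionary names are illustrative.

```python
# Counts copied from the value tables of the two columns.
id_counts = {-1: 28, 1: 22152, 2: 12630, 3: 15739, 4: 42706, 5: 6471, 6: 105, 7: 5711}
name_counts = {
    "Dark": 42706, "Dusty Light": 22152, "Light": 15739, "Medium Dusty": 12630,
    "Bright": 6471, "Medium": 5711, "Undefined": 105, "Unknown": 28,
}

# Every count is distinct, so a count identifies exactly one id and one name.
name_by_count = {count: name for name, count in name_counts.items()}
id_to_name = {value_id: name_by_count[count] for value_id, count in id_counts.items()}

for value_id in sorted(id_to_name):
    print(value_id, id_to_name[value_id])
```

Matching by count only works here because no two categories share a frequency; on the raw data a group-by over both columns would be the reliable check.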

Length

Histogram of lengths of the category

Common Values (Plot)

Word                      Count     Frequency (%)
dark                      42706     30.4%
light                     37891     27.0%
dusty                     34782     24.8%
medium                    18341     13.1%
bright                    6471      4.6%
undefined                 105       0.1%
unknown                   28        < 0.1%

Most occurring characters

Character                 Count     Frequency (%)
t                         79144     11.0%
D                         77488     10.8%
i                         62808     8.7%
u                         53123     7.4%
r                         49177     6.8%
g                         44362     6.2%
h                         44362     6.2%
k                         42734     5.9%
a                         42706     5.9%
L                         37891     5.3%
Other values (13)         185189    25.8%


perceived_colour_master_id
Real number (ℝ)

Alerts: High correlation

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.8079722
Minimum: -1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 685
Negative (%): 0.6%
Memory size: 824.7 KiB

Quantile statistics

Minimum: -1
5-th percentile: 2
Q1: 4
Median: 5
Q3: 11
95-th percentile: 19
Maximum: 20
Range: 21
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.376727
Coefficient of variation (CV): 0.68862015
Kurtosis: -0.36204043
Mean: 7.8079722
Median Absolute Deviation (MAD): 3
Skewness: 0.80137952
Sum: 824069
Variance: 28.909193
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Common values

Value                     Count     Frequency (%)
5                         22585     21.4%
2                         18469     17.5%
9                         12665     12.0%
4                         9403      8.9%
12                        8924      8.5%
18                        5878      5.6%
11                        5657      5.4%
19                        3526      3.3%
20                        3181      3.0%
8                         3121      3.0%
Other values (10)         12133     11.5%

Minimum 10 values

Value                     Count     Frequency (%)
-1                        685       0.6%
1                         1223      1.2%
2                         18469     17.5%
3                         2734      2.6%
4                         9403      8.9%
5                         22585     21.4%
6                         1100      1.0%
7                         1829      1.7%
8                         3121      3.0%
9                         12665     12.0%

Maximum 10 values

Value                     Count     Frequency (%)
20                        3181      3.0%
19                        3526      3.3%
18                        5878      5.6%
16                        3         < 0.1%
15                        2180      2.1%
14                        105       0.1%
13                        2269      2.1%
12                        8924      8.5%
11                        5657      5.4%
10                        5         < 0.1%

perceived_colour_master_name
Categorical

Alerts: High correlation

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 824.7 KiB

Value                     Count
Black                     22585
Blue                      18469
White                     12665
Pink                      9403
Grey                      8924
Other values (15)         33496

Length

Max length: 15
Median length: 12
Mean length: 4.9246082
Min length: 3

Characters and Unicode

Total characters: 519753
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Black
2nd row: White
3rd row: White
4th row: Black
5th row: White

Common Values

Value                     Count     Frequency (%)
Black                     22585     21.4%
Blue                      18469     17.5%
White                     12665     12.0%
Pink                      9403      8.9%
Grey                      8924      8.5%
Red                       5878      5.6%
Beige                     5657      5.4%
Green                     3526      3.3%
Khaki green               3181      3.0%
Yellow                    3121      3.0%
Other values (10)         12133     11.5%

Length

Histogram of lengths of the category

Word                      Count     Frequency (%)
black                     22585     20.6%
blue                      18469     16.8%
white                     12665     11.5%
pink                      9403      8.6%
grey                      8924      8.1%
green                     6715      6.1%
red                       5878      5.4%
beige                     5657      5.2%
khaki                     3181      2.9%
yellow                    3121      2.8%
Other values (11)         13233     12.0%

Most occurring characters

Character                 Count     Frequency (%)
e                         83082     16.0%
l                         52912     10.2%
B                         48983     9.4%
k                         35854     6.9%
i                         33948     6.5%
a                         31780     6.1%
c                         23685     4.6%
r                         23571     4.5%
n                         23386     4.5%
u                         23335     4.5%
Other values (23)         139217    26.8%


department_no
Real number (ℝ)

Alerts: High correlation

Distinct: 299
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4532.7778
Minimum: 1201
Maximum: 9989
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 824.7 KiB

Quantile statistics

Minimum: 1201
5-th percentile: 1338
Q1: 1676
Median: 4222
Q3: 7389
95-th percentile: 8748
Maximum: 9989
Range: 8788
Interquartile range (IQR): 5713

Descriptive statistics

Standard deviation: 2712.692
Coefficient of variation (CV): 0.59846128
Kurtosis: -1.3964267
Mean: 4532.7778
Median Absolute Deviation (MAD): 2556
Skewness: 0.27135387
Sum: 4.7839844 × 10^8
Variance: 7358697.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                     Count     Frequency (%)
7616                      2032      1.9%
1338                      1921      1.8%
8716                      1874      1.8%
4242                      1839      1.7%
7648                      1488      1.4%
1640                      1429      1.4%
1636                      1402      1.3%
1676                      1359      1.3%
1344                      1354      1.3%
1643                      1339      1.3%
Other values (289)        89505     84.8%

Minimum 10 values

Value                     Count     Frequency (%)
1201                      829       0.8%
1202                      16        < 0.1%
1212                      299       0.3%
1222                      238       0.2%
1241                      87        0.1%
1244                      667       0.6%
1310                      251       0.2%
1313                      630       0.6%
1322                      1206      1.1%
1334                      864       0.8%

Maximum 10 values

Value                     Count     Frequency (%)
9989                      122       0.1%
9986                      513       0.5%
9985                      579       0.5%
9984                      236       0.2%
9020                      33        < 0.1%
8956                      363       0.3%
8917                      421       0.4%
8888                      269       0.3%
8852                      281       0.3%
8815                      21        < 0.1%

Text

Distinct: 250
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 824.7 KiB

Length

Max length: 40
Median length: 26
Mean length: 13.140219
Min length: 2

Characters and Unicode

Total characters: 1386845
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: Jersey Basic
2nd row: Jersey Basic
3rd row: Jersey Basic
4th row: Clean Lingerie
5th row: Clean Lingerie
Word                      Count     Frequency (%)
jersey                    24170     10.5%
girl                      16349     7.1%
kids                      14307     6.2%
fancy                     13087     5.7%
boy                       11674     5.1%
young                     10428     4.5%
baby                      7973      3.5%
knitwear                  7498      3.2%
basic                     7078      3.1%
woven                     6640      2.9%
Other values (132)        111638    48.4%

Most occurring characters

Character                 Count     Frequency (%)
e                         142984    10.3%
s                         126069    9.1%
(space)                   125300    9.0%
r                         110268    8.0%
i                         87155     6.3%
o                         77105     5.6%
a                         65051     4.7%
y                         61342     4.4%
n                         54902     4.0%
c                         42943     3.1%
Other values (50)         493726    35.6%


index_code
Categorical

High correlation 

Distinct 10
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 824.7 KiB

A                 26001
D                 15149
F                 12553
H                 12007
I                 9214
Other values (5)  30618

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 105542
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row A
2nd row A
3rd row A
4th row B
5th row B

Common Values

Value  Count  Frequency (%)
A      26001  24.6%
D      15149  14.4%
F      12553  11.9%
H      12007  11.4%
I      9214   8.7%
G      8875   8.4%
C      6961   6.6%
B      6775   6.4%
J      4615   4.4%
S      3392   3.2%
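
The relationship between the full Common Values table and the shorter overview list (top five values plus an "Other values" row) can be sketched as follows. The counts are copied from the table above; the helper name `top_values_with_other` and the `top_n` parameter are hypothetical, and this is only an illustration of the arithmetic, not the tool's implementation.

```python
def top_values_with_other(value_counts, top_n=5):
    """Return (label, count, percent) rows for the top_n values plus an 'Other values' row."""
    total = sum(value_counts.values())
    ranked = sorted(value_counts.items(), key=lambda item: item[1], reverse=True)
    rows = [(value, n, round(100 * n / total, 1)) for value, n in ranked[:top_n]]
    rest = ranked[top_n:]
    if rest:
        other = sum(n for _, n in rest)
        rows.append((f"Other values ({len(rest)})", other, round(100 * other / total, 1)))
    return rows


# Counts for index_code as listed in the Common Values table.
index_code_counts = {
    "A": 26001, "D": 15149, "F": 12553, "H": 12007, "I": 9214,
    "G": 8875, "C": 6961, "B": 6775, "J": 4615, "S": 3392,
}
rows = top_values_with_other(index_code_counts)
for label, n, pct in rows:
    print(label, n, f"{pct}%")
```

The five remaining codes (G, C, B, J, S) sum to 30618, which matches the "Other values (5)" row in the overview list.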

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value  Count  Frequency (%)
a      26001  24.6%
d      15149  14.4%
f      12553  11.9%
h      12007  11.4%
i      9214   8.7%
g      8875   8.4%
c      6961   6.6%
b      6775   6.4%
j      4615   4.4%
s      3392   3.2%

Most occurring characters

Value  Count  Frequency (%)
A      26001  24.6%
D      15149  14.4%
F      12553  11.9%
H      12007  11.4%
I      9214   8.7%
G      8875   8.4%
C      6961   6.6%
B      6775   6.4%
J      4615   4.4%
S      3392   3.2%

Most occurring categories, scripts and blocks

All 105542 characters (100.0%) fall into a single category, script and block, each reported as (unknown). The most frequent characters per category, per script and per block are therefore identical to the "Most occurring characters" table.

index_name
Categorical

High correlation 

Distinct 10
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 824.7 KiB

Ladieswear              26001
Divided                 15149
Menswear                12553
Children Sizes 92-140   12007
Children Sizes 134-170  9214
Other values (5)        30618

Length

Max length 30
Median length 21
Mean length 13.761725
Min length 5

Characters and Unicode

Total characters 1452440
Distinct characters 41
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Ladieswear
2nd row Ladieswear
3rd row Ladieswear
4th row Lingeries/Tights
5th row Lingeries/Tights

Common Values

Value                           Count  Frequency (%)
Ladieswear                      26001  24.6%
Divided                         15149  14.4%
Menswear                        12553  11.9%
Children Sizes 92-140           12007  11.4%
Children Sizes 134-170          9214   8.7%
Baby Sizes 50-98                8875   8.4%
Ladies Accessories              6961   6.6%
Lingeries/Tights                6775   6.4%
Children Accessories, Swimwear  4615   4.4%
Sport                           3392   3.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value             Count  Frequency (%)
sizes             30096  16.5%
ladieswear        26001  14.3%
children          25836  14.2%
divided           15149  8.3%
menswear          12553  6.9%
92-140            12007  6.6%
accessories       11576  6.4%
134-170           9214   5.1%
50-98             8875   4.9%
baby              8875   4.9%
Other values (4)  21743  12.0%

Most occurring characters

Value              Count   Frequency (%)
e                  196467  13.5%
i                  155708  10.7%
s                  123889  8.5%
r                  90748   6.2%
d                  89096   6.1%
a                  85006   5.9%
(space)            76383   5.3%
w                  47784   3.3%
n                  45164   3.1%
L                  39737   2.7%
Other values (31)  502458  34.6%

Most occurring categories, scripts and blocks

All 1452440 characters (100.0%) fall into a single category, script and block, each reported as (unknown). The most frequent characters per category, per script and per block are therefore identical to the "Most occurring characters" table.

index_group_no
Categorical

High correlation 

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 824.7 KiB

1   39737
4   34711
2   15149
3   12553
26  3392

Length

Max length 2
Median length 1
Mean length 1.0321389
Min length 1
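
The length figures above follow directly from the value counts: every value is one character long except "26", which has two. The sketch below reproduces the mean length and the report's total character count for this variable (108934) from those counts. `length_summary` is a hypothetical helper written for this illustration.

```python
def length_summary(value_counts):
    """Derive string-length statistics from a {value: row count} mapping."""
    total_rows = sum(value_counts.values())
    total_chars = sum(len(value) * n for value, n in value_counts.items())
    return {
        "min_length": min(len(value) for value in value_counts),
        "max_length": max(len(value) for value in value_counts),
        "total_characters": total_chars,
        "mean_length": total_chars / total_rows,
    }


# Counts for index_group_no, treated as strings because the variable is categorical.
index_group_no_counts = {"1": 39737, "4": 34711, "2": 15149, "3": 12553, "26": 3392}
summary = length_summary(index_group_no_counts)
print(summary)
```

The 3392 two-character values add 3392 extra characters on top of the 105542 rows, which is where the mean length slightly above 1 comes from.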

Characters and Unicode

Total characters 108934
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Common Values

Value  Count  Frequency (%)
1      39737  37.7%
4      34711  32.9%
2      15149  14.4%
3      12553  11.9%
26     3392   3.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value  Count  Frequency (%)
1      39737  37.7%
4      34711  32.9%
2      15149  14.4%
3      12553  11.9%
26     3392   3.2%

Most occurring characters

Value  Count  Frequency (%)
1      39737  36.5%
4      34711  31.9%
2      18541  17.0%
3      12553  11.5%
6      3392   3.1%

Most occurring categories, scripts and blocks

All 108934 characters (100.0%) fall into a single category, script and block, each reported as (unknown). The most frequent characters per category, per script and per block are therefore identical to the "Most occurring characters" table.

index_group_name
Categorical

High correlation 

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 824.7 KiB

Ladieswear     39737
Baby/Children  34711
Divided        15149
Menswear       12553
Sport          3392

Length

Max length 13
Median length 10
Mean length 10.157473
Min length 5

Characters and Unicode

Total characters 1072040
Distinct characters 23
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Ladieswear
2nd row Ladieswear
3rd row Ladieswear
4th row Ladieswear
5th row Ladieswear

Common Values

Value          Count  Frequency (%)
Ladieswear     39737  37.7%
Baby/Children  34711  32.9%
Divided        15149  14.4%
Menswear       12553  11.9%
Sport          3392   3.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value          Count  Frequency (%)
ladieswear     39737  37.7%
baby/children  34711  32.9%
divided        15149  14.4%
menswear       12553  11.9%
sport          3392   3.2%

Most occurring characters

Value              Count   Frequency (%)
e                  154440  14.4%
a                  126738  11.8%
d                  104746  9.8%
i                  104746  9.8%
r                  90393   8.4%
w                  52290   4.9%
s                  52290   4.9%
n                  47264   4.4%
L                  39737   3.7%
b                  34711   3.2%
Other values (13)  264685  24.7%

Most occurring categories, scripts and blocks

All 1072040 characters (100.0%) fall into a single category, script and block, each reported as (unknown). The most frequent characters per category, per script and per block are therefore identical to the "Most occurring characters" table.

section_no
Real number (ℝ)

High correlation 

Distinct 57
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 42.664219
Minimum 2
Maximum 97
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 824.7 KiB

Quantile statistics

Minimum 2
5-th percentile 6
Q1 20
median 46
Q3 61
95-th percentile 77
Maximum 97
Range 95
Interquartile range (IQR) 41

Descriptive statistics

Standard deviation 23.260105
Coefficient of variation (CV) 0.54518999
Kurtosis -1.1000683
Mean 42.664219
Median Absolute Deviation (MAD) 20
Skewness -0.084535432
Sum 4502867
Variance 541.03248
Monotonicity Not monotonic
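
Several of the statistics above are simple functions of the others, which gives a quick consistency check on the report. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the reported minimum, maximum, quartiles, mean, standard deviation and row count (105542, from the dataset overview). The dictionary keys and the helper name `derived_statistics` are hypothetical, and the results agree with the report only up to the rounding of the reported inputs.

```python
def derived_statistics(stats):
    """Recompute derived statistics from the reported summary values."""
    return {
        "range": stats["maximum"] - stats["minimum"],
        "iqr": stats["q3"] - stats["q1"],
        "cv": stats["std"] / stats["mean"],       # coefficient of variation
        "variance": stats["std"] ** 2,
        "sum": stats["mean"] * stats["n"],
    }


# Values reported for section_no.
section_no = {
    "minimum": 2, "maximum": 97, "q1": 20, "q3": 61,
    "mean": 42.664219, "std": 23.260105, "n": 105542,
}
derived = derived_statistics(section_no)
for name, value in derived.items():
    print(name, value)
```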
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
15                 7295   6.9%
53                 7124   6.7%
44                 4932   4.7%
76                 4469   4.2%
77                 3899   3.7%
61                 3598   3.4%
79                 3490   3.3%
11                 3376   3.2%
46                 3328   3.2%
66                 3270   3.1%
Other values (47)  60761  57.6%

Value  Count  Frequency (%)
2      2337   2.2%
4      3      < 0.1%
5      1894   1.8%
6      2725   2.6%
8      2266   2.1%
11     3376   3.2%
14     1270   1.2%
15     7295   6.9%
16     1581   1.5%
17     1      < 0.1%

Value  Count  Frequency (%)
97     559    0.5%
82     682    0.6%
80     35     < 0.1%
79     3490   3.3%
77     3899   3.7%
76     4469   4.2%
72     2034   1.9%
71     26     < 0.1%
70     280    0.3%
66     3270   3.1%

section_name
Text

Distinct 56
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 824.7 KiB

Length

Max length 30
Median length 22
Mean length 16.743069
Min length 4

Characters and Unicode

Total characters 1767097
Distinct characters 48
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Womens Everyday Basics
2nd row Womens Everyday Basics
3rd row Womens Everyday Basics
4th row Womens Lingerie
5th row Womens Lingerie

Value              Count   Frequency (%)
womens             33662   12.8%
(not shown)        17323   6.6%
kids               15153   5.8%
collection         14419   5.5%
divided            14275   5.4%
baby               10551   4.0%
girl               10128   3.9%
accessories        9735    3.7%
everyday           8876    3.4%
basics             8828    3.4%
Other values (49)  120028  45.6%

Most occurring characters

Value              Count   Frequency (%)
e                  182527  10.3%
(space)            157436  8.9%
s                  142303  8.1%
i                  130588  7.4%
o                  123340  7.0%
n                  99911   5.7%
r                  93569   5.3%
a                  92150   5.2%
l                  72523   4.1%
d                  67367   3.8%
Other values (38)  605383  34.3%

Most occurring categories, scripts and blocks

All 1767097 characters (100.0%) fall into a single category, script and block, each reported as (unknown). The most frequent characters per category, per script and per block are therefore identical to the "Most occurring characters" table.

garment_group_no
Real number (ℝ)

High correlation 

Distinct 21
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1010.4383
Minimum 1001
Maximum 1025
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 824.7 KiB

Quantile statistics

Minimum 1001
5-th percentile 1002
Q1 1005
median 1009
Q3 1017
95-th percentile 1020
Maximum 1025
Range 24
Interquartile range (IQR) 12

Descriptive statistics

Standard deviation 6.7310232
Coefficient of variation (CV) 0.0066614886
Kurtosis -1.287045
Mean 1010.4383
Median Absolute Deviation (MAD) 6
Skewness 0.31875162
Sum 1.0664368 × 10^8
Variance 45.306673
Monotonicity Not monotonic
Histogram with fixed size bins (bins=21)
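
To make "fixed size bins" concrete, the sketch below splits the reported range of garment_group_no (1001 to 1025) into 21 equal-width bins and assigns a value to a bin. It assumes equal-width binning over the closed interval from minimum to maximum, with the maximum placed in the last bin; that is a common convention but is not confirmed by the report. Both function names are hypothetical.

```python
def fixed_width_bin_edges(minimum, maximum, bins):
    """Return bins + 1 equally spaced edges covering [minimum, maximum]."""
    width = (maximum - minimum) / bins
    return [minimum + i * width for i in range(bins + 1)]


def bin_index(value, minimum, maximum, bins):
    """Return the zero-based bin a value falls into (maximum goes in the last bin)."""
    if value == maximum:
        return bins - 1
    width = (maximum - minimum) / bins
    return int((value - minimum) / width)


edges = fixed_width_bin_edges(1001, 1025, 21)
print("bin width:", edges[1] - edges[0])
print("bin of 1005:", bin_index(1005, 1001, 1025, 21))
```

With a range of 24 and 21 bins, each bin is slightly wider than one unit, so a few adjacent integer codes can share a bin.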
Value              Count  Frequency (%)
1005               21445  20.3%
1019               11519  10.9%
1002               8126   7.7%
1003               7490   7.1%
1017               7441   7.1%
1009               6727   6.4%
1010               5838   5.5%
1020               5145   4.9%
1013               4874   4.6%
1007               4501   4.3%
Other values (11)  22436  21.3%

Value  Count  Frequency (%)
1001   3873   3.7%
1002   8126   7.7%
1003   7490   7.1%
1005   21445  20.3%
1006   1965   1.9%
1007   4501   4.3%
1008   908    0.9%
1009   6727   6.4%
1010   5838   5.5%
1011   2116   2.0%

Value  Count  Frequency (%)
1025   1559   1.5%
1023   1061   1.0%
1021   2272   2.2%
1020   5145   4.9%
1019   11519  10.9%
1018   2787   2.6%
1017   7441   7.1%
1016   3100   2.9%
1014   1541   1.5%
1013   4874   4.6%

garment_group_name
Categorical

High correlation 

Distinct 21
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 824.7 KiB

Jersey Fancy       21445
Accessories        11519
Jersey Basic       8126
Knitwear           7490
Under-, Nightwear  7441
Other values (16)  49521

Length

Max length 29
Median length 17
Mean length 10.951811
Min length 5

Characters and Unicode

Total characters 1155876
Distinct characters 40
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Jersey Basic
2nd row Jersey Basic
3rd row Jersey Basic
4th row Under-, Nightwear
5th row Under-, Nightwear

Common Values

Value              Count  Frequency (%)
Jersey Fancy       21445  20.3%
Accessories        11519  10.9%
Jersey Basic       8126   7.7%
Knitwear           7490   7.1%
Under-, Nightwear  7441   7.1%
Trousers           6727   6.4%
Blouses            5838   5.5%
Shoes              5145   4.9%
Dresses Ladies     4874   4.6%
Outdoor            4501   4.3%
Other values (11)  22436  21.3%

Length

[Figure: histogram of lengths of the category]

Value              Count  Frequency (%)
jersey             29571  18.3%
fancy              21445  13.3%
accessories        11519  7.1%
trousers           9827   6.1%
basic              8126   5.0%
knitwear           7490   4.6%
under              7441   4.6%
nightwear          7441   4.6%
blouses            5838   3.6%
shoes              5145   3.2%
Other values (20)  47761  29.6%

Most occurring characters

Value              Count   Frequency (%)
e                  160751  13.9%
s                  150245  13.0%
r                  108764  9.4%
i                  59052   5.1%
a                  57461   5.0%
n                  57297   5.0%
(space)            56062   4.9%
c                  55942   4.8%
y                  54946   4.8%
o                  51000   4.4%
Other values (30)  344356  29.8%

Most occurring categories, scripts and blocks

All 1155876 characters (100.0%) fall into a single category, script and block, each reported as (unknown). The most frequent characters per category, per script and per block are therefore identical to the "Most occurring characters" table.

detail_desc
Text

Distinct 43404
Distinct (%) 41.3%
Missing 416
Missing (%) 0.4%
Memory size 824.7 KiB

Length

Max length 764
Median length 468
Mean length 142.1619
Min length 11

Characters and Unicode

Total characters 14944912
Distinct characters 98
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 21430
Unique (%) 20.4%

Sample

1st row Jersey top with narrow shoulder straps.
2nd row Jersey top with narrow shoulder straps.
3rd row Jersey top with narrow shoulder straps.
4th row Microfibre T-shirt bra with underwired, moulded, lightly padded cups that shape the bust and provide good support. Narrow adjustable shoulder straps and a narrow hook-and-eye fastening at the back. Without visible seams for greater comfort.
5th row Microfibre T-shirt bra with underwired, moulded, lightly padded cups that shape the bust and provide good support. Narrow adjustable shoulder straps and a narrow hook-and-eye fastening at the back. Without visible seams for greater comfort.

Value                Count    Frequency (%)
and                  160065   6.4%
a                    151693   6.0%
with                 150703   6.0%
the                  135045   5.4%
in                   105374   4.2%
at                   80688    3.2%
back                 36807    1.5%
front                36244    1.4%
soft                 35579    1.4%
waist                34284    1.4%
Other values (5000)  1586260  63.1%

Most occurring characters

Value              Count    Frequency (%)
(space)            2407661  16.1%
e                  1318549  8.8%
t                  1241876  8.3%
a                  1029247  6.9%
n                  910904   6.1%
i                  876105   5.9%
s                  828095   5.5%
o                  718446   4.8%
r                  618196   4.1%
d                  602822   4.0%
Other values (88)  4393011  29.4%

Most occurring categories, scripts and blocks

All 14944912 characters (100.0%) fall into a single category, script and block, each reported as (unknown). The most frequent characters per category, per script and per block are therefore identical to the "Most occurring characters" table.

Interactions

[Figures: 100 pairwise interaction plots; not recoverable from the extracted text]

Correlations

[Figure: correlation plot]

Columns, in order: article_id colour_group_code colour_group_name department_no garment_group_name garment_group_no graphical_appearance_name graphical_appearance_no index_code index_group_name index_group_no index_name perceived_colour_master_id perceived_colour_master_name perceived_colour_value_id perceived_colour_value_name product_code product_group_name product_type_no section_no

article_id                    1.000 -0.042 0.072 -0.073 0.091 0.010 0.063 -0.000 0.070 0.083 0.083 0.070 0.029 0.063 -0.054 0.032 1.000 0.076 -0.040 -0.042
colour_group_code             -0.042 1.000 1.000 0.080 0.135 -0.017 0.165 0.050 0.120 0.143 0.143 0.120 -0.339 0.930 0.009 0.262 -0.042 0.097 0.077 -0.004
colour_group_name             0.072 1.000 1.000 0.176 0.157 0.160 0.194 0.734 0.216 0.212 0.212 0.216 0.844 0.845 0.814 0.814 0.072 0.128 0.141 0.172
department_no                 -0.073 0.080 0.176 1.000 0.438 -0.054 0.161 -0.097 0.660 0.650 0.650 0.660 -0.041 0.156 0.007 0.119 -0.073 0.333 -0.011 0.314
garment_group_name            0.091 0.135 0.157 0.438 1.000 1.000 0.244 0.052 0.458 0.335 0.335 0.458 0.164 0.132 0.139 0.139 0.091 0.540 0.439 0.401
garment_group_no              0.010 -0.017 0.160 -0.054 1.000 1.000 0.255 0.058 0.356 0.214 0.214 0.356 -0.024 0.138 0.027 0.089 0.010 0.540 -0.053 0.182
graphical_appearance_name     0.063 0.165 0.194 0.161 0.244 0.255 1.000 1.000 0.189 0.213 0.213 0.189 0.181 0.144 0.305 0.305 0.063 0.173 0.115 0.189
graphical_appearance_no       -0.000 0.050 0.734 -0.097 0.052 0.058 1.000 1.000 0.018 0.014 0.014 0.018 -0.096 0.147 0.019 0.734 -0.000 0.000 0.013 -0.026
index_code                    0.070 0.120 0.216 0.660 0.458 0.356 0.189 0.018 1.000 1.000 1.000 1.000 0.155 0.187 0.118 0.118 0.070 0.376 0.311 0.642
index_group_name              0.083 0.143 0.212 0.650 0.335 0.214 0.213 0.014 1.000 1.000 1.000 1.000 0.134 0.182 0.102 0.102 0.083 0.159 0.077 0.764
index_group_no                0.083 0.143 0.212 0.650 0.335 0.214 0.213 0.014 1.000 1.000 1.000 1.000 0.134 0.182 0.102 0.102 0.083 0.159 0.077 0.764
index_name                    0.070 0.120 0.216 0.660 0.458 0.356 0.189 0.018 1.000 1.000 1.000 1.000 0.155 0.187 0.118 0.118 0.070 0.376 0.311 0.642
perceived_colour_master_id    0.029 -0.339 0.844 -0.041 0.164 -0.024 0.181 -0.096 0.155 0.134 0.134 0.155 1.000 1.000 -0.038 0.355 0.029 0.139 -0.087 0.000
perceived_colour_master_name  0.063 0.930 0.845 0.156 0.132 0.138 0.144 0.147 0.187 0.182 0.182 0.187 1.000 1.000 0.594 0.594 0.063 0.113 0.131 0.147
perceived_colour_value_id     -0.054 0.009 0.814 0.007 0.139 0.027 0.305 0.019 0.118 0.102 0.102 0.118 -0.038 0.594 1.000 1.000 -0.054 0.102 -0.029 -0.005
perceived_colour_value_name   0.032 0.262 0.814 0.119 0.139 0.089 0.305 0.734 0.118 0.102 0.102 0.118 0.355 0.594 1.000 1.000 0.032 0.102 0.060 0.086
product_code                  1.000 -0.042 0.072 -0.073 0.091 0.010 0.063 -0.000 0.070 0.083 0.083 0.070 0.029 0.063 -0.054 0.032 1.000 0.076 -0.040 -0.042
product_group_name            0.076 0.097 0.128 0.333 0.540 0.540 0.173 0.000 0.376 0.159 0.159 0.376 0.139 0.113 0.102 0.102 0.076 1.000 0.711 0.271
product_type_no               -0.040 0.077 0.141 -0.011 0.439 -0.053 0.115 0.013 0.311 0.077 0.077 0.311 -0.087 0.131 -0.029 0.060 -0.040 0.711 1.000 0.027
section_no                    -0.042 -0.004 0.172 0.314 0.401 0.182 0.189 -0.026 0.642 0.764 0.764 0.642 0.000 0.147 -0.005 0.086 -0.042 0.271 0.027 1.000
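
The "High correlation" flags on individual variables can be related to this matrix by listing the pairs whose absolute coefficient reaches some cut-off. The sketch below does this for a four-variable excerpt of the matrix. The threshold of 0.5 is an assumption chosen for illustration, not a setting read from the report, and `high_correlation_pairs` is a hypothetical helper.

```python
def high_correlation_pairs(matrix, threshold=0.5):
    """Return (a, b, coefficient) for each unordered pair with |coefficient| >= threshold."""
    names = sorted(matrix)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(matrix[a][b]) >= threshold:
                pairs.append((a, b, matrix[a][b]))
    return pairs


# Excerpt of the correlation matrix above (values copied from the report).
excerpt = {
    "article_id": {"article_id": 1.0, "department_no": -0.073, "index_group_no": 0.083, "section_no": -0.042},
    "department_no": {"article_id": -0.073, "department_no": 1.0, "index_group_no": 0.650, "section_no": 0.314},
    "index_group_no": {"article_id": 0.083, "department_no": 0.650, "index_group_no": 1.0, "section_no": 0.764},
    "section_no": {"article_id": -0.042, "department_no": 0.314, "index_group_no": 0.764, "section_no": 1.0},
}
pairs = high_correlation_pairs(excerpt)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value}")
```

Because the full matrix mixes numeric and categorical variables and is not perfectly symmetric, a check over all 20 variables would need to decide which direction of each pair to read.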

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
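
The counts behind such nullity plots amount to tallying missing cells per column. The sketch below does this for a list of row dictionaries, treating `None` as missing; `missing_summary` and the tiny example rows are hypothetical. The last lines redo the overview arithmetic: 416 missing cells out of 105542 rows times 25 variables.

```python
def missing_summary(rows, columns):
    """Return per-column missing counts, total missing cells and the missing percentage."""
    per_column = {col: sum(1 for row in rows if row.get(col) is None) for col in columns}
    missing_cells = sum(per_column.values())
    total_cells = len(rows) * len(columns)
    percent = 100 * missing_cells / total_cells if total_cells else 0.0
    return per_column, missing_cells, percent


# Hypothetical three-row example with one missing description.
example_rows = [
    {"section_no": 16, "detail_desc": "Jersey top with narrow shoulder straps."},
    {"section_no": 61, "detail_desc": None},
    {"section_no": 62, "detail_desc": "Tights with built-in support to lift the bottom."},
]
per_column, missing_cells, percent = missing_summary(example_rows, ["section_no", "detail_desc"])
print(per_column, missing_cells, round(percent, 1))

# Overview figures from the report: 416 missing cells, 105542 rows, 25 variables.
report_percent = 100 * 416 / (105542 * 25)
print(round(report_percent, 3))
```

The report-level percentage is well under 0.1%, consistent with the "< 0.1%" shown in the dataset statistics.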

Sample

| | article_id | product_code | prod_name | product_type_no | product_type_name | product_group_name | graphical_appearance_no | graphical_appearance_name | colour_group_code | colour_group_name | perceived_colour_value_id | perceived_colour_value_name | perceived_colour_master_id | perceived_colour_master_name | department_no | department_name | index_code | index_name | index_group_no | index_group_name | section_no | section_name | garment_group_no | garment_group_name | detail_desc |
| 0 | 108775015 | 108775 | Strap top | 253 | Vest top | Garment Upper body | 1010016 | Solid | 9 | Black | 4 | Dark | 5 | Black | 1676 | Jersey Basic | A | Ladieswear | 1 | Ladieswear | 16 | Womens Everyday Basics | 1002 | Jersey Basic | Jersey top with narrow shoulder straps. |
| 1 | 108775044 | 108775 | Strap top | 253 | Vest top | Garment Upper body | 1010016 | Solid | 10 | White | 3 | Light | 9 | White | 1676 | Jersey Basic | A | Ladieswear | 1 | Ladieswear | 16 | Womens Everyday Basics | 1002 | Jersey Basic | Jersey top with narrow shoulder straps. |
| 2 | 108775051 | 108775 | Strap top (1) | 253 | Vest top | Garment Upper body | 1010017 | Stripe | 11 | Off White | 1 | Dusty Light | 9 | White | 1676 | Jersey Basic | A | Ladieswear | 1 | Ladieswear | 16 | Womens Everyday Basics | 1002 | Jersey Basic | Jersey top with narrow shoulder straps. |
| 3 | 110065001 | 110065 | OP T-shirt (Idro) | 306 | Bra | Underwear | 1010016 | Solid | 9 | Black | 4 | Dark | 5 | Black | 1339 | Clean Lingerie | B | Lingeries/Tights | 1 | Ladieswear | 61 | Womens Lingerie | 1017 | Under-, Nightwear | Microfibre T-shirt bra with underwired, moulded, lightly padded cups that shape the bust and provide good support. Narrow adjustable shoulder straps and a narrow hook-and-eye fastening at the back. Without visible seams for greater comfort. |
| 4 | 110065002 | 110065 | OP T-shirt (Idro) | 306 | Bra | Underwear | 1010016 | Solid | 10 | White | 3 | Light | 9 | White | 1339 | Clean Lingerie | B | Lingeries/Tights | 1 | Ladieswear | 61 | Womens Lingerie | 1017 | Under-, Nightwear | Microfibre T-shirt bra with underwired, moulded, lightly padded cups that shape the bust and provide good support. Narrow adjustable shoulder straps and a narrow hook-and-eye fastening at the back. Without visible seams for greater comfort. |
| 5 | 110065011 | 110065 | OP T-shirt (Idro) | 306 | Bra | Underwear | 1010016 | Solid | 12 | Light Beige | 1 | Dusty Light | 11 | Beige | 1339 | Clean Lingerie | B | Lingeries/Tights | 1 | Ladieswear | 61 | Womens Lingerie | 1017 | Under-, Nightwear | Microfibre T-shirt bra with underwired, moulded, lightly padded cups that shape the bust and provide good support. Narrow adjustable shoulder straps and a narrow hook-and-eye fastening at the back. Without visible seams for greater comfort. |
| 6 | 111565001 | 111565 | 20 den 1p Stockings | 304 | Underwear Tights | Socks & Tights | 1010016 | Solid | 9 | Black | 4 | Dark | 5 | Black | 3608 | Tights basic | B | Lingeries/Tights | 1 | Ladieswear | 62 | Womens Nightwear, Socks & Tigh | 1021 | Socks and Tights | Semi shiny nylon stockings with a wide, reinforced trim at the top. Use with a suspender belt. 20 denier. |
| 7 | 111565003 | 111565 | 20 den 1p Stockings | 302 | Socks | Socks & Tights | 1010016 | Solid | 13 | Beige | 2 | Medium Dusty | 11 | Beige | 3608 | Tights basic | B | Lingeries/Tights | 1 | Ladieswear | 62 | Womens Nightwear, Socks & Tigh | 1021 | Socks and Tights | Semi shiny nylon stockings with a wide, reinforced trim at the top. Use with a suspender belt. 20 denier. |
| 8 | 111586001 | 111586 | Shape Up 30 den 1p Tights | 273 | Leggings/Tights | Garment Lower body | 1010016 | Solid | 9 | Black | 4 | Dark | 5 | Black | 3608 | Tights basic | B | Lingeries/Tights | 1 | Ladieswear | 62 | Womens Nightwear, Socks & Tigh | 1021 | Socks and Tights | Tights with built-in support to lift the bottom. Black in 30 denier and light amber in 15 denier. |
| 9 | 111593001 | 111593 | Support 40 den 1p Tights | 304 | Underwear Tights | Socks & Tights | 1010016 | Solid | 9 | Black | 4 | Dark | 5 | Black | 3608 | Tights basic | B | Lingeries/Tights | 1 | Ladieswear | 62 | Womens Nightwear, Socks & Tigh | 1021 | Socks and Tights | Semi shiny tights that shape the tummy, thighs and calves while also encouraging blood circulation in the legs. Elasticated waist. |

| | article_id | product_code | prod_name | product_type_no | product_type_name | product_group_name | graphical_appearance_no | graphical_appearance_name | colour_group_code | colour_group_name | perceived_colour_value_id | perceived_colour_value_name | perceived_colour_master_id | perceived_colour_master_name | department_no | department_name | index_code | index_name | index_group_no | index_group_name | section_no | section_name | garment_group_no | garment_group_name | detail_desc |
| 105532 | 949594001 | 949594 | LOGG Elvis jogger. | 272 | Trousers | Garment Lower body | 1010016 | Solid | 8 | Dark Grey | 4 | Dark | -1 | Unknown | 1919 | Jersey | A | Ladieswear | 1 | Ladieswear | 2 | H&M+ | 1005 | Jersey Fancy | Joggers in soft sweatshirt fabric with an elasticated, drawstring waist, diagonal side pockets and slim legs with ribbed hems. |
| 105533 | 950449002 | 950449 | Compact brush Fancy | 78 | Other accessories | Accessories | 1010016 | Solid | 50 | Other Pink | 5 | Bright | 4 | Pink | 4313 | Girls Small Acc/Bags | J | Children Accessories, Swimwear | 4 | Baby/Children | 43 | Kids Accessories, Swimwear & D | 1019 | Accessories | Small, folding hair brush with a rhinestone-decorated lid that has a mirror inside. Diameter 6.5 cm. |
| 105534 | 952267001 | 952267 | Heavy plain overknee tights 1p | 304 | Underwear Tights | Socks & Tights | 1010013 | Other pattern | 9 | Black | 4 | Dark | 5 | Black | 3608 | Tights basic | B | Lingeries/Tights | 1 | Ladieswear | 62 | Womens Nightwear, Socks & Tigh | 1021 | Socks and Tights | Fine-knit tights with an elasticated waist that are thinner at the top and more opaque at the bottom giving them the appearance of over-the-knee socks. |
| 105535 | 952937003 | 952937 | Jets dress | 265 | Dress | Garment Full body | 1010001 | All over pattern | 13 | Beige | 2 | Medium Dusty | 1 | Mole | 1641 | Jersey | A | Ladieswear | 1 | Ladieswear | 18 | Womens Trend | 1005 | Jersey Fancy | Fitted, calf-length dress in viscose jersey with a stand-up collar and concealed zip at the back. Double layer at the top with wrapover, draped sections, close-fitting, extra-long sleeves and an asymmetric skirt with a high slit in one side. Lined. |
| 105536 | 952938001 | 952938 | Elton top | 254 | Top | Garment Upper body | 1010001 | All over pattern | 13 | Beige | 2 | Medium Dusty | 1 | Mole | 1641 | Jersey | A | Ladieswear | 1 | Ladieswear | 18 | Womens Trend | 1005 | Jersey Fancy | Fitted top in jersey with a round neckline and extra-long sleeves. Additional draped layer at the front. |
| 105537 | 953450001 | 953450 | 5pk regular Placement1 | 302 | Socks | Socks & Tights | 1010014 | Placement print | 9 | Black | 4 | Dark | 5 | Black | 7188 | Socks Bin | F | Menswear | 3 | Menswear | 26 | Men Underwear | 1021 | Socks and Tights | Socks in a fine-knit cotton blend with a small motif at the top and elasticated tops. |
| 105538 | 953763001 | 953763 | SPORT Malaga tank | 253 | Vest top | Garment Upper body | 1010016 | Solid | 9 | Black | 4 | Dark | 5 | Black | 1919 | Jersey | A | Ladieswear | 1 | Ladieswear | 2 | H&M+ | 1005 | Jersey Fancy | Loose-fitting sports vest top in ribbed fast-drying functional fabric made from recycled polyester with a racer back and rounded hem. |
| 105539 | 956217002 | 956217 | Cartwheel dress | 265 | Dress | Garment Full body | 1010016 | Solid | 9 | Black | 4 | Dark | 5 | Black | 1641 | Jersey | A | Ladieswear | 1 | Ladieswear | 18 | Womens Trend | 1005 | Jersey Fancy | Short, A-line dress in jersey with a round neckline and V-shaped opening at the front with narrow ties. Long, voluminous raglan sleeves and wide cuffs with covered buttons. |
| 105540 | 957375001 | 957375 | CLAIRE HAIR CLAW | 72 | Hair clip | Accessories | 1010016 | Solid | 9 | Black | 4 | Dark | 5 | Black | 3946 | Small Accessories | D | Divided | 2 | Divided | 52 | Divided Accessories | 1019 | Accessories | Large plastic hair claw. |
| 105541 | 959461001 | 959461 | Lounge dress | 265 | Dress | Garment Full body | 1010016 | Solid | 11 | Off White | 1 | Dusty Light | 9 | White | 1641 | Jersey | A | Ladieswear | 1 | Ladieswear | 18 | Womens Trend | 1005 | Jersey Fancy | Calf-length dress in ribbed jersey made from a cotton blend. Low-cut V-neck at the back, dropped shoulders and long, wide sleeves that taper to the cuffs. Unlined. |